Draft genome sequences for ten strains of Xanthomonas species that have phylogenomic importance

Here we report draft-quality genome sequences for pathotype strains of eight plant-pathogenic bacterial pathovars: Xanthomonas campestris pv. asclepiadis, X. campestris pv. cannae, X. campestris pv. esculenti, X. campestris pv. nigromaculans, X. campestris pv. parthenii, X. campestris pv. phormiicola, X. campestris pv. zinniae and X. dyei pv. eucalypti (= X. campestris pv. eucalypti). We also sequenced the type strain of species X. melonis and the unclassified Xanthomonas strain NCPPB 1067. These data will be useful for phylogenomic and taxonomic studies, filling some important gaps in sequence coverage of Xanthomonas phylogenetic diversity. We include representatives of previously under-sequenced pathovars and species-level clades. Furthermore, these genome sequences may be useful in elucidating the molecular basis for important phenotypes, such as biosynthesis of coronatine-related toxins and degradation of fungal toxin cercosporin.


INTRODUCTION
The genus Xanthomonas comprises Gram-negative bacteria that are usually associated with plants; it includes causative agents of economically important diseases of crops such as brassicas, rice, cassava, banana and tomatoes [1]. The taxonomy of Xanthomonas spp. has a long and sometimes confusing history. The List of Prokaryotic names with Standing in Nomenclature (accessed 3 April 2023) lists 35 validly named species for the genus, plus several synonyms and others that have not been validly published [2]. Many Xanthomonas species are further divided into intra-specific groups called pathovars, which are defined primarily by their host range and in some cases also by biochemical or physiological differences [3].

Recent efforts have attempted to reconcile the taxonomy of Xanthomonas species and pathovars with their phylogeny, often informed by sequence data from one or several genes or from genome-wide sequence data; this has led to the proposal of new species [4][5][6][7][8] and the transfer of pathovars from one species to another [9][10][11][12][13][14][15][16][17][18][19]. Analysis of partial DNA sequences of the gyrB gene suggested that many pathovars of the species X. campestris are phylogenetically not closely related to the type strain of this species; they are much closer to other named species or so-called species-level clades (slc) [20].
Although partial gyrB gene sequences are available for many of these taxonomically incongruent strains, genome-wide sequence data would facilitate phylogenomic studies at a higher resolution. Here, we present draft-quality genome sequences for strains that nominally belong to X. campestris, but whose inclusion in that species is incongruent with phylogeny [6,20,21,22,23]; evolutionarily, they fall within species 'X. cannabis', X. dyei, X. hortorum, X. melonis, X. sacchari, Slc 4 and Slc 6. Furthermore, we sequenced strain NCPPB 4013 [24], representing a putative new species, and the type strain of X. melonis, thereby filling some gaps in the resources for phylogenomics of Xanthomonas.

RESULTS AND DISCUSSION
We assembled draft genome sequences de novo from short sequence reads for ten strains of Xanthomonas species and deposited these in GenBank under the accession numbers listed in Table 1. These strains include the type strain for X. melonis, pathotype strains for eight pathovars of X. campestris and one strain (from radish) not assigned to any named species. For nine of the ten newly sequenced genomes, partial gyrB sequences are available under GenBank accessions EU007531.1, EU285211.1, EU285210.1, EU285202.1, EU285197.1, EU285181.1, EU285180.1, EU285177.1 and EU285064.1. In all cases, blastn searches confirmed 100 % nucleotide sequence identity over the full length of the partial gyrB gene versus the corresponding genome sequence assembly. CheckM estimated contamination levels at below 5 % for all assemblies except for that of X. campestris pv. nigromaculans (Tables 1 and S1, available in the online version of this article). Pairwise ANI values are tabulated in Table S2.
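The gyrB consistency check described above can be illustrated with a short script. This is a minimal sketch, not the procedure used in this study: it assumes blastn was run with a tabular output such as `-outfmt "6 qseqid sseqid pident length qlen"`, and the query names, contig names and values in the example rows are hypothetical placeholders rather than real results.

```python
# Minimal sketch: decide whether each partial gyrB query matches its genome
# assembly at 100 % identity over the full query length.
# Assumed input: blastn tabular output with the columns
#   qseqid  sseqid  pident  length  qlen
# The rows below are made-up placeholders, NOT real BLAST results.
EXAMPLE_BLAST_TSV = "\n".join([
    "gyrB_partial_A\tcontig_12\t100.000\t530\t530",  # identical, full length
    "gyrB_partial_B\tcontig_3\t100.000\t512\t530",   # identical, but partial
    "gyrB_partial_C\tcontig_7\t99.811\t530\t530",    # full length, one mismatch
])


def full_length_identical(tsv_text):
    """Return {query_id: bool}; True if any hit is 100 % identical
    and its alignment length equals the query length."""
    results = {}
    for line in tsv_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        qseqid, _sseqid, pident, length, qlen = line.split("\t")
        ok = float(pident) == 100.0 and int(length) == int(qlen)
        # A query passes if at least one of its hits passes.
        results[qseqid] = results.get(qseqid, False) or ok
    return results


if __name__ == "__main__":
    for query, ok in full_length_identical(EXAMPLE_BLAST_TSV).items():
        print(query, "PASS" if ok else "FAIL")
```

Requiring the alignment length to equal the query length, in addition to 100 % identity, guards against a short perfect match being mistaken for agreement over the whole partial gene.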

'X. cannabis'
The species 'X. cannabis' was proposed to include X. campestris pv. cannabis, and X. sp. strain Nyagatare isolated from beans [37,38], though according to The List of Prokaryotic names with Standing in Nomenclature [2] (accessed 29 March 2023) this name is not yet validly published. Originally, X. cannabis was proposed in 1955 when Okabe and Goto transferred Pseudomonas cannabis [39] into the genus Xanthomonas [40]; however, no type strain was deposited. In 2014, Netsu and colleagues [41] speculated that a Japanese isolate identified as X. campestris pv. cannabis might represent the same pathogen [39] previously described as 'P. cannabis' and 'X. cannabis'.
A description of leaf spot on Zinnia plants in Zimbabwe [42] proposed the causative pathogen to be a forma specialis of X. nigromaculans, where X. nigromaculans was the pathogen responsible for leaf spot on burdock [22], now known as X. campestris pv. nigromaculans and phylogenetically falling within X. hortorum. Subsequently, this Zinnia pathogen was renamed X. campestris pv. zinniae [3]. Previous analyses based on the gyrB locus [20] and our phylogenomic tree (Fig. 1) both place the Zinnia pathogen as completely distinct from X. campestris pv. nigromaculans and X. hortorum. Rather, it falls within the clade corresponding to 'X. cannabis' and Slc 1.
In addition to its significance as a pathogen in Africa [42], Asia [43,44] and Europe [45], X. campestris pv. zinniae is of interest for its ability to degrade the toxin cercosporin, which is produced by phytopathogenic fungi of the genus Cercospora [46][47][48][49]. The search for genes encoding the degradation pathway identified an oxidoreductase and a putative transcriptional regulator but also highlighted that additional factors are required for cercosporin degradation [49]. Availability of this draft genome sequence may enable identification of additional genes that could facilitate the engineering of Cercospora-resistant crop plants [49].

X. dyei
Strain NCPPB 2337 causes dieback on Eucalyptus citriodora in Australia and was originally described as a new species: X. eucalypti [50]. In the major taxonomic revision that saw many species reclassified as pathovars of X. campestris, this pathogen was renamed as X. campestris pv. eucalypti [3], with NCPPB 2337 designated as the pathotype strain. In Parkinson and colleagues' gyrB-based phylogenetic analysis [20], it was placed within Slc 2 along with unclassified xanthomonads isolated from Lobelia spp. and with X. campestris pv. laureliae isolated from Laurelia novae-zelandiae. Following multi-locus sequence analysis (MLSA), this slc was assigned the species name X. dyei, so that NCPPB 2337 became the pathotype strain of X. dyei pv. eucalypti [12]. Consistent with this, our phylogenomic analysis places X. dyei pv. eucalypti close to the type strain of X. dyei (Fig. 1), which was isolated from Metrosideros excelsa [12]. Values of ANI and dDDH were 96.86 % and 93.9 %, respectively.
Xanthomonas strain NCPPB 1067 was isolated from Raphanus sativus (radish) in Vanuatu. In the previous gyrB-based phylogenetic analysis [20], it was the sole representative of Slc 3. Our phylogenomic analysis places it close to the type strain of X. melonis (Fig. 1), with which it shares 98.57 % ANI and 86.80 % dDDH, consistent with its belonging to this species and with Parkinson's Slc 3 corresponding to the species X. melonis.
Table 1. Bacterial strains used for genome sequencing. Contamination levels were assessed using CheckM [36]. Additional assembly metrics are provided in the …
Fig. 1. Phylogenetic tree, based on core-genome sequences, for the newly sequenced strains, type strains and representative strains of Xanthomonas spp., generated using PhaME [28] and FastTree [29]. The tree was graphically rendered using the Interactive Tree of Life [67]. Configuration and tree files are available from https://github.com/davidjstudholme/phylogenomics-Xanthomonas-1. Accession numbers and references for the genome assemblies are listed in Table 2. Newly sequenced genomes are indicated with a black star.
In 1933, Takimoto [55] described Bacterium phormicola as the cause of bacterial streak on New Zealand flax (Phormium tenax). This pathogen was subsequently classified as X. campestris pv. phormiicola [3], though Parkinson and colleagues' later examination of gyrB sequences [20] placed it as the sole member of Slc 6, outside of X. campestris. Our genome-based phylogenetic analysis of NCPPB 2983 confirmed this (Fig. 1) and placed it as a distinct species close to X. hyacinthi and X. translucens [56]. It shares 95 % ANI with X. bonasiae FX4 (Table S2), which is on the borderline of the commonly used criterion for species delimitation.
Xanthomonas campestris pv. phormiicola is unusual among xanthomonads in that it produces the phytotoxin coronatine [57] and/or related derivatives of coronafacic acid [49], a trait more usually associated with Pseudomonas spp. Tamura and colleagues [57] hypothesized that phytotoxin biosynthesis might be encoded on a plasmid but failed to isolate plasmid DNA. The availability of a genome sequence opens the possibility of identifying the genetic basis for this trait. After our submission of the first version of this manuscript, a high-quality genome sequence assembly was published for Xanthomonas campestris pv. phormiicola and candidate biosynthesis genes were identified [56].

X. campestris pv. asclepiadis: a potential new species
Bacterial blight of milkweed (Asclepias spp.) is attributed to X. campestris pv. asclepiadis [24]. No sequence data were available for this pathogen in GenBank and therefore its phylogenetic position within the genus was unclear. Our analysis places the pathotype strain NCPPB 4013 very distant from the type strain of X. campestris, and closer to X. euroxanthea (Fig. 1), a species isolated from walnut trees (Juglans regia) [58]. The ANI between NCPPB 4013 and the X. euroxanthea type strain is 95.97 %, close to the threshold of 95-96 % that is commonly used to delineate species boundaries [30,59,60,61,62,63,64]. The Type Strain Genome Server, calculating a dDDH value of 79. % with the X. euroxanthea type strain, reports that NCPPB 4013 does not belong to X. euroxanthea and may represent a new species.
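To make the ANI reasoning concrete, the following sketch sorts pairwise ANI values into three bins relative to the commonly used 95-96 % band. It is an illustration only and is not part of the analysis pipeline used here: the function name, the bin labels and the choice to treat the whole 95-96 % interval as "borderline" are assumptions made for this example. The ANI values are those quoted in the text.

```python
# Minimal sketch: bin pairwise ANI values relative to the 95-96 % band that is
# commonly used to delineate species. Treating the whole band as "borderline"
# is an assumption of this example, not a rule taken from the study.
ANI_BAND_LOW = 95.0
ANI_BAND_HIGH = 96.0


def classify_ani(ani_percent):
    """Return a coarse label for one pairwise ANI value (in percent)."""
    if ani_percent > ANI_BAND_HIGH:
        return "above band"   # consistent with belonging to the same species
    if ani_percent >= ANI_BAND_LOW:
        return "borderline"   # ANI alone is inconclusive here
    return "below band"       # consistent with different species


# ANI values as quoted in the text (percent).
PAIRS = {
    "NCPPB 4013 vs X. euroxanthea type strain": 95.97,
    "NCPPB 2983 vs X. bonasiae FX4": 95.0,
    "NCPPB 1067 vs X. melonis type strain": 98.57,
    "X. dyei pv. eucalypti vs X. dyei type strain": 96.86,
}

if __name__ == "__main__":
    for pair, ani in PAIRS.items():
        print(f"{pair}: {ani:.2f} % -> {classify_ani(ani)}")
```

Borderline cases such as NCPPB 4013 are the ones where a second line of evidence, for example dDDH from the Type Strain Genome Server, is needed before drawing a taxonomic conclusion.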

X. campestris pv. cannae (X. sacchari)
The pathogen responsible for a bacterial leaf spot and leaf blight on canna (Canna x generalis) in India was described as X. campestris pv. cannae [65], though phylogenetic analysis of its gyrB gene sequence places the pathotype strain NCPPB 4345 in the X. sacchari clade [20]. Our phylogenomic analysis (Fig. 1) confirms a close relationship between NCPPB 4345 and the type strain of X. sacchari. However, the current species description for X. sacchari [16] states that 'The strains of this species are isolated from diseased sugarcane' and therefore would exclude this pathogen, which was isolated from canna rather than sugarcane. Currently, X. sacchari is not subdivided into pathovars, since there is no variation in the host range. This species is distinguished from other xanthomonads by its metabolic activity on a long list of carbon substrates [16], only a subset of which have been tested on X. campestris pv. cannae. Transfer of this pathovar into X. sacchari would require revision of its species description to encompass isolation hosts beyond sugarcane.

X. campestris pv. nigromaculans
The causative agent of black spot on burdock (Arctium lappa) was originally described in 1927 under the name Bacterium nigromaculans [22] and subsequently renamed as X. campestris pv. nigromaculans [3]. Several authors have proposed, based on sequence-based phylogenetic analysis, that X. campestris pv. nigromaculans is more closely related to the species X. hortorum [6,9,10,20,21] than to the type strain of X. campestris. This proposal was initially based on just a single genetic locus, gyrB [20]. Later, Dia and colleagues arrived at the same conclusion on the basis of both MLSA and phylogenomic analysis of the core genome, but they did not make their X. campestris pv. nigromaculans genome sequence publicly available at that time [6], motivating our sequencing of its pathotype strain NCPPB 1935. Subsequently, Dia and colleagues publicly deposited their sequence data for this same strain [21]. Our phylogenomic analysis (Fig. 1) is consistent with the previous studies, placing this strain clearly into Dia's sub-cluster A [6] within the X. hortorum clade, close to X. hortorum pv. gardneri and X. hortorum pv. cynarae.
In summary, our genome sequencing and phylogenetic reconstruction support previous suggestions that X. campestris pv. nigromaculans falls within a clade that corresponds to the species X. hortorum [6,9,10,20,21] and should be transferred to that species. We also note that our genome assembly for X. campestris pv. nigromaculans has a rather high contamination level of 12.92 %, whereas assembly GCA_938743425.1 [21] is of higher quality, with a contamination level of 0.81 % (Tables 1 and S1).

Conclusion
Here we present draft-quality genome sequences for ten plant-pathogenic bacterial strains from the National Collection of Plant Pathogenic Bacteria (NCPPB). The data will be useful for phylogenomic studies, filling some important gaps in sequence coverage of the Xanthomonas phylogenetic diversity. Furthermore, these genome sequences may be useful in elucidating the molecular basis for important phenotypes such as biosynthesis of coronatine-related toxins and degradation of fungal toxin cercosporin.

Peer review history
… assembly for Xanthomonas campestris pv. badrii NEB122 PT; this strain represents Slc 4. To the best of my knowledge, there are still no genome sequences available for Slc 5. Shortly after we submitted the original manuscript, we discovered that Peduzzi and colleagues have recently sequenced CFBP 8444, which belongs to Slc 6. So, we deleted this misleading sentence, and reorganised this section of the text for improved clarity. We also added these previously omitted genome assemblies in the revised version of Figure 1 and added short comments to this effect in the text.
Thank you for alerting us to this gap in my knowledge of the literature. We have now updated the text to reflect this narrative of the history of the taxonomic name "X. cannabis" and cited those references.

(v) L98: esculenti in italics
This is now resolved.
Yes, at line 164 we observed that the ANI between asclepiadis NCPPB 4013 and X. euroxanthea is 95.97%, which is rather borderline in respect of species delineation. The Type (Strain) Genome Server, TYGS, includes the tool recommended by the reviewer. TYGS does not consider that NCPPB 4013 belongs to X. euroxanthea and suggests that it is a potential new species. This is based on the following dDDH values:
(viii) L177/178: Please provide ANI and dDDH values when suggesting to incorporate "X. sontii" in the species X. sacchari.
On reflection, the case for incorporating X. sontii into X. sacchari is very weak and so we have deleted this suggestion.
This phrase was poorly worded. We changed it to: "clade that corresponds to the species X. hortorum".
We have now fixed multiple formatting problems with the references.
Yes, I can appreciate that the emphasis on slc is potentially confusing. As the reviewer suggests, we have removed these from most of the headings (except where the slc does not correspond to a named species and is therefore a useful label) and have made some small changes to the text to put slightly less emphasis on these.
I believe the manuscript would benefit from including average nucleotide identities throughout the manuscript and in table format.
We have now generated a table of ANIs for all genome sequence assemblies mentioned in this manuscript. This is a very large table and so we include it as supplementary Excel spreadsheet file, in which the ANIs are coloured on a white-to-red scale.
The authors point to >96% ANI but >95% is often referred to as the cut-off for species.

X cannabis is not a species and apparently the article by Jacobs et al. is not sufficient for species status.
It is indeed true that "X. cannabis" is not validly published according to List of Prokaryotic names with Standing in Nomenclature https://lpsn.dsmz.de/genus/xanthomonas. We state this in the manuscript. Also, this is one reason why we have to mention Slc 1, since this informal grouping corresponds closely to the entity described (invalidly) in the literature as "X. cannabis". Biologically speaking, it probably is a species.

Reviewer 3
DNA gyrase gene sequence is available for all the strains sequenced in the study. The authors need to make sure that gyrase gene sequence is identical to the corresponding sequence in the genome of that particular strain.
Yes, indeed we checked this. Partial gyrB sequences are available in the databases for 9 out of the 10 sequenced strains. We have prepared a PowerPoint file containing screenshots of BLAST searches that confirm 100% identity over the full lengths of the partial gyrB sequences for all 9 sequenced genomes. We now state in the manuscript: "For nine of the ten newly sequenced genomes, there are available partial gyrB sequences (GenBank accessions: EU007531. …
We have now italicised the species names in the Figure. For completeness and transparency, we also added accession numbers and citations for the genome assemblies in Figure 1; these are found in the new Table 2.
Table 1: Adding genome assembly statistics such as genome coverage and contamination, N50 value will make the genome assembly more reliable.
We have now compiled a table with the full results from CheckM analysis of the genome assemblies in a supplementary Excel file. We also added coverage and contamination to Table 1 and mentioned the high level of contamination in one assembly in the main text.
Line 108, Line 124, Line 151: Provide ANI data and dDDH values to strengthen the statement.
Our revised text now reads: "… analysis of NCPPB 2983 confirmed this and placed it as a distinct species close to X. hyacinthi and X. translucens (55) (Figure 1). It shares 95 % ANI with X. bonasiae FX4 (Supplementary Table S2), which is on the borderline of the commonly used criterion for species delimitation. Based on dDDH values, the Type Strain Genome Server identifies that this genome does not fall within any named species and potentially represents a new species."
Provide ANI values in a table or heatmap format to make the paper more comprehensible to the reader.
We have now generated a table of ANIs for all genome sequence assemblies mentioned in this manuscript. This is a large table and so we include it as supplementary Excel spreadsheet file, in which the ANIs are coloured on a white-to-red scale.
Further, calculation of digital DNA-DNA hybridization values might help in the taxonomic refinement of these strains along with phylogeny and ANI.
Where useful to do so, we added results of dDDH as calculated by the Type Strain Genome Server and also mentioned this in the Methods section.
Line 162: There are more genomes available for X. euroxanthea in the NCBI GenBank database. Adding these genomes to the phylogeny might clear whether NCPPB 4013 falls amongst the X. euroxanthea clade or in close association with them.
We have now included all available X. euroxanthea genome assemblies in Figure 1 and in the ANI calculations in the supplementary Excel file.
Line 177: The authors need to provide ANI and dDDH values of X. campestris pv. cannae with X. sontii and X. sacchari to support their statement that "There is also a case to incorporate "X. sontii" into this species, given its close phylogenetic affinity"
On reflection, the case for incorporating X. sontii into X. sacchari is very weak and so we have deleted this suggestion.
I hope this newly improved manuscript is now acceptable for publication.
Comments: Overall, the study was well received. From my understanding of the reviewers' comments, most highlighted a need to calculate dDDH and visualise ANI results in a heatmap. The others should be relatively minor to address (i.e. spelling/phrasing/formatting). This is a study that would be of interest to the field and community. The reviewers have highlighted minor concerns with the work presented. Please ensure that you address their comments.

Reviewer 3 recommendation and comments
https://doi.org/10.1099/acmi.0.000532.v1.4 © 2023 Anonymous. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License.

Anonymous.
Date report received: 01 February 2023
Recommendation: Major Revision
Comments: 1. Methodological rigour, reproducibility and availability of underlying data 2. Presentation of results 3. How the style and organization of the paper communicates and represents key findings 4. Literature analysis or discussion 5. Any other relevant comments
The study/resource is important and timely for the community working in the area of plant pathogens in general and Xanthomonas in particular. Below are the major comments that authors need to address to strengthen the manuscript.
DNA gyrase gene sequence is available for all the strains sequenced in the study. The authors need to make sure that gyrase gene sequence is identical to the corresponding sequence in the genome of that particular strain.
Line 49: 34 valid species available.
Figure 1: Species names in the phylogenetic tree are not in italics.
Table 1: Adding genome assembly statistics such as genome coverage and contamination, N50 value will make the genome assembly more reliable.
Line 108, Line 124, Line 151: Provide ANI data and dDDH values to strengthen the statement.
Provide ANI values in a table or heatmap format to make the paper more comprehensible to the reader.
Further, calculation of digital DNA-DNA hybridization values might help in the taxonomic refinement of these strains along with phylogeny and ANI.
Line 162: There are more genomes available for X. euroxanthea in the NCBI GenBank database. Adding these genomes to the phylogeny might clear whether NCPPB 4013 falls amongst the X. euroxanthea clade or in close association with them.
Line 177: The authors need to provide ANI and dDDH values of X. campestris pv. cannae with X. sontii and X. sacchari to support their statement that "There is also a case to incorporate "X. sontii" into this species, given its close phylogenetic affinity"